Query Stability in Monotonic Data-Aware Business Processes [Extended Version]
Organizations continuously accumulate data, often according to some business
processes. If one poses a query over such data for decision support, it is
important to know whether the query is stable, that is, whether the answers
will stay the same or may change in the future because business processes may
add further data. We investigate query stability for conjunctive queries. To
this end, we define a formalism that combines an explicit representation of the
control flow of a process with a specification of how data is read and inserted
into the database. We consider different restrictions of the process model and
the state of the system, such as negation in conditions, cyclic executions,
read access to written data, presence of pending process instances, and the
possibility to start fresh process instances. We identify for which facet
combinations stability of conjunctive queries is decidable and provide
encodings into variants of Datalog that are optimal with respect to the
worst-case complexity of the problem.
Comment: This report is the extended version of a paper accepted at the 19th
International Conference on Database Theory (ICDT 2016), March 15-18, 2016 -
Bordeaux, France
Scaling Data Science Solutions with Semantics and Machine Learning: Bosch Case
Industry 4.0 and Internet of Things (IoT) technologies unlock unprecedented
amounts of data from factory production, posing big data challenges in volume
and variety. In that context, distributed computing solutions such as cloud
systems are leveraged to parallelise the data processing and reduce computation
time. As cloud systems become increasingly popular, there is growing
demand for users who were not originally cloud experts (such as data
scientists and domain experts) to deploy their solutions on cloud systems.
However, it is non-trivial to address both the high demand for cloud system
users and the excessive time required to train them. To this end, we propose
SemCloud, a semantics-enhanced cloud system that couples a cloud system with
semantic technologies and machine learning. SemCloud relies on domain
ontologies and mappings for data integration, and parallelises the semantic
data integration and data analysis on distributed computing nodes. Furthermore,
SemCloud adopts adaptive Datalog rules and machine learning for automated
resource configuration, allowing non-cloud experts to use the cloud system. The
system has been evaluated in an industrial use case with millions of data items,
thousands of repeated runs, and domain users, showing promising results.
Comment: Paper accepted at ISWC 2023 In-Use track
Literal-Aware Knowledge Graph Embedding for Welding Quality Monitoring: A Bosch Case
Recently there has been a series of studies in knowledge graph embedding
(KGE), which attempts to learn the embeddings of the entities and relations as
numerical vectors and mathematical mappings via machine learning (ML). However,
there has been limited research that applies KGE to industrial problems in
manufacturing. This paper investigates whether and to what extent KGE can be
used for an important problem: quality monitoring for welding in the manufacturing
industry, an impactful process that accounts for the production of millions
of cars annually. The work is in line with Bosch research on data-driven
solutions that aim to replace the traditional way of destroying cars, which
is extremely costly and produces waste. The paper tackles two very challenging
questions simultaneously: how large the welding spot diameter is, and to which
car body the welded spot belongs. The problem setting is difficult for
traditional ML because there exists a high number of car bodies that should be
assigned as class labels. We formulate the problem as link prediction and
experiment with popular KGE methods on real industry data, with consideration of
literals. Our results reveal both limitations and promising aspects of adapted
KGE methods.
Comment: Paper accepted at ISWC 2023 In-Use track
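To make the link-prediction formulation concrete, here is a minimal sketch that ranks candidate car bodies for a welding spot with a TransE-style score. TransE is used here only as one well-known KGE method; the abstract does not say which methods were evaluated. The entity names, the relation name, and the 2-dimensional embeddings are all hypothetical.

```python
import math

def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 distance between head + relation and tail."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

# Hypothetical, hand-picked embeddings (a trained model would learn these).
spot = [0.1, 0.2]            # embedding of one welding spot
belongs_to = [0.4, 0.1]      # embedding of the "belongs to car body" relation
car_bodies = {
    "body_a": [0.5, 0.3],
    "body_b": [-0.3, 0.9],
}

# Link prediction: rank every candidate tail entity by its score.
ranked = sorted(
    car_bodies,
    key=lambda name: transe_score(spot, belongs_to, car_bodies[name]),
    reverse=True,
)
print(ranked)
```

Ranking candidate entities avoids treating each car body as a separate class label, which is the difficulty the abstract attributes to traditional ML.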
An ASP approach to query completeness reasoning
We address the problem of determining whether a query over a partially complete database can be answered completely, which arises in data integration and decision support. Using so-called table completeness statements, one asserts which parts of a database are complete. The question then is whether these are sufficient to retrieve the same answers as if the database had complete information about the domain of application. Previous work in the area of databases has characterized the complexity of the problem, but did not come up with a practical implementation.
In this paper we explore ASP engines as a possible platform to execute completeness reasoning problems. We first generalize the problem by taking into account finite domain constraints and then translate it into rules that may have disjunctions in the heads. The translation allows us to encode completeness problems into cautious reasoning in ASP. We implemented our encoding in two state-of-the-art solvers and tested it on examples that involve many disjunctions, but allow for significant optimizations. It turned out that neither engine took advantage of the possibilities for optimization.
Status: published
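The notions of table completeness statement and query completeness can be illustrated on a single toy instance. The sketch below only checks the definitions against one hypothetical pair of ideal and available databases. It is not the ASP encoding from the paper, which reasons over all possible databases. The relation, attribute values, and function names are assumptions made for illustration.

```python
from typing import Callable, Set, Tuple

Row = Tuple[str, str]  # hypothetical relation pupil(name, school)

def satisfies_tc(ideal: Set[Row], available: Set[Row],
                 condition: Callable[[Row], bool]) -> bool:
    """A table completeness statement holds if every ideal row that meets
    the condition is also present in the available table."""
    return all(row in available for row in ideal if condition(row))

def pupils_at(table: Set[Row], school: str) -> Set[str]:
    """Selection query: names of pupils enrolled at the given school."""
    return {name for (name, s) in table if s == school}

# Ideal database (the real world) versus the partially complete available one.
ideal_pupil = {("ann", "goethe"), ("bob", "goethe"), ("eve", "dante")}
available_pupil = {("ann", "goethe"), ("bob", "goethe")}

# Statement: "pupil is complete for school goethe".
tc_holds = satisfies_tc(ideal_pupil, available_pupil, lambda r: r[1] == "goethe")

# A query is answered completely if both databases give the same answers.
goethe_complete = pupils_at(ideal_pupil, "goethe") == pupils_at(available_pupil, "goethe")
dante_complete = pupils_at(ideal_pupil, "dante") == pupils_at(available_pupil, "dante")
print(tc_holds, goethe_complete, dante_complete)
```

On this instance the statement covers the query about school goethe but says nothing about school dante, so only the first query returns the same answers over both databases.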
Implementing query completeness reasoning
Data completeness is commonly regarded as one of the key aspects
of data quality. With this paper we make two main contributions: (i)
we develop techniques to reason about the completeness of a query
answer over a partially complete database, taking into account constraints
that hold over the database, and (ii) we implement them
by an encoding into logic programming paradigms. As constraints
we consider primary and foreign keys as well as finite domain constraints.
In this way we can identify more situations in which a
query is complete than was possible with previous work. For each
combination of constraints, we establish characterizations of the
completeness reasoning and we show how to translate them into
logic programs. To deal with the case when a query is incomplete,
we compute a more general query, which contains all answers of
the original query, but that is complete. As a proof of concept we
ran our encodings against test cases that capture characteristics of
a real-world scenario.
Status: published